ABSTRACT

In testing for the relation of risk factors to a particular cause of death,
such as a rare disease, a longitudinal study requires the observation of many
individuals for long periods of time before enough information has accrued to
permit reliable statistical analysis. In the present paper, this difficulty is
circumvented through the use of a matched retrospective design. In particular,
tests of the hypothesis of no effect are obtained for the constant
proportionality model and for a second model in which the risk factors are quantified.

The  asymptotic  distributions  of  the  test  statistics  are  also  derived. 
Significance and Explanation

This  paper  is  concerned  with  testing  hypotheses  that  certain  presumed 
risk  factors  significantly  affect  survival.  For  example,  it  is  desired  to 
test  whether  certain  specific  risk  factors  affect  the  mortality  rate  from 
a particular  disease,  such  as  the  relationship  of  exposure  to  polyurethane 
vapors  and  death  due  to  leukemia.  If  the  disease  is  rare,  traditional 
methods  of  investigation  involve  the  observation  of  many  individuals  over 
very  long  periods  of  time  before  enough  mortalities  have  accrued  to  make 
statistical  analysis  feasible. 

To reduce the time needed to acquire enough data for reliable statistical analysis,
a matched retrospective experimental design is suggested, in which each
individual who died at time t from the disease under investigation is
matched with an individual chosen at random from those alive at time t.


The  responsibility  for  the  wording  and  views  expressed  in  this  descriptive  summary 
lies  with  MRC,  and  not  with  the  authors  of  this  report. 
TESTING HYPOTHESES FOR EFFECTS ON SURVIVAL
BY THE ANALYSIS OF A MATCHED RETROSPECTIVE DESIGN

Bernard Harris and Anastasios A. Tsiatis


1.  INTRODUCTION  AND  SUMMARY 

In this paper we construct tests of hypotheses for the existence of effects on
survival due to the presence of risk factors, such as may be caused by unfavorable
environmental situations. Such problems arise naturally in comparing various
environmental situations in employment and the possible effect that these
may have on occupational health and safety.

Traditionally, longitudinal studies have been employed for this purpose. In such a
study, risk factors are identified in advance and individuals exposed to these risk factors
are observed for a predesignated length of time. Frequently, such studies have been
utilized for the purpose of identifying risk factors as causes of death from a particular
disease, such as the relationship of exposure to polyurethane vapors and death due to
leukemia. However, if the disease under investigation is rare, then many individuals have
to be observed for very long periods of time before enough mortalities have accrued to
make statistical analysis feasible.

To circumvent this difficulty, a matched retrospective design is proposed. That is,
each individual who died at age $t$ from the disease under investigation is matched with an
individual chosen at random from those alive at age $t$. We refer to the individual who
died as the case and his matched counterpart as the control. For each such pair, we
determine the risk factors to which they have been exposed. Let $\lambda_j(t)$, $j = 0,1,\ldots,k$, be the
hazard function for the disease of interest for each individual exposed to risk factor $j$.
Further let $\mu_j(t)$, $j = 0,1,\ldots,k$, be the hazard function for other causes of death for
each individual exposed to risk factor $j$. We assume that for every pair $i, j$, $0 \le i, j \le k$,
$\lambda_i(t)/\lambda_j(t) = \gamma_{ij} > 0$, a constant (independent of age). In the statistical
literature, this is referred to as the constant proportionality hazards model (see Cox (1972)).

Sponsored by the United States Army under Contract No. DAAG29-75-C-0024 and by the National
Cancer Institute under Grant No. 1R01 CA 18332.
In Section 2, we employ this model to obtain a test of the hypothesis $\gamma_{ij} = 1$, $0 \le i, j \le k$,
and derive some of its large sample properties. Section 3 is devoted to the specific
case in which the hazard functions satisfy a relationship of the form $\gamma_{ij} = \exp[\beta(v_i - v_j)]$,
where the $v_l$, $0 \le l \le k$, are known constants. Such an assumption may be appropriate
when the risk factors can be quantitatively measured, for example, when individuals have
been exposed to specific levels of toxicity. This specific model has been proposed by
Cox (1972).

2.  TESTS OF HYPOTHESES FOR DIFFERENCES IN MORTALITY DUE TO RISK FACTORS IN A MATCHED
RETROSPECTIVE EXPERIMENT

We divide a population $\Pi$ into $k+1$ strata, $\pi_0, \ldots, \pi_k$. An element of the
population will be placed in stratum $\pi_j$ if it has been exposed to risk factor $j$. A particular
cause of death, such as a specific disease, will be designated as the cause of death of
interest. Data are to be collected as follows. If an individual dies from the cause of
death of interest at age $t$, then a second individual alive at age $t$ will be selected at
random from the population and the stratum for each will be recorded. We denote the hazard
function for the disease of interest by

$$\lambda_j(t) = \lambda(t)\exp(\xi_j), \qquad j = 0,\ldots,k, \tag{2.1}$$

and for the other causes of death by $\mu_j(t)$. With no loss of generality, we can set $\xi_0 = 0$.
Using the above data, we will construct a test of the hypothesis $H_0\colon \xi_1 = \cdots = \xi_k = 0$, or
equivalently, that the hazard rates do not depend on the risk factors.

Let $T$ denote the age of death of any individual in the study. Assuming that the
survival time, $T_1$, for the disease of interest and the survival time, $T_2$, for other causes of
death are stochastically independent within each stratum, we get

$$P(T > t \mid \pi_j) = P(\min(T_1,T_2) > t \mid \pi_j) = \exp\Big\{-\int_0^t \big[\lambda(x)e^{\xi_j} + \mu_j(x)\big]\,dx\Big\}. \tag{2.2}$$

Let

$$\nu_j(t) = \lim_{h \to 0} P\{t < T_1 \le t+h,\ T > t \mid \pi_j\}/h. \tag{2.3}$$

It will be convenient to refer to $\nu_j(t)$ as the competitive hazard rate for the
disease of interest. Since $T_1$, $T_2$ are stochastically independent within each stratum,
it follows readily that

$$\nu_j(t) = \lambda(t)\exp(\xi_j)\exp\Big\{-\int_0^t \big[\lambda(x)e^{\xi_j} + \mu_j(x)\big]\,dx\Big\}. \tag{2.4}$$

Applying Bayes' Theorem, we get

$$P(\pi_j \mid T_1 = t,\ T_2 > t) = \nu_j(t)P(\pi_j)\Big/\sum_{i=0}^{k} \nu_i(t)P(\pi_i) = p_{1j}(t), \tag{2.5}$$

and

$$P(\pi_j \mid T_1 > t,\ T_2 > t) = P(T > t \mid \pi_j)P(\pi_j)\Big/\sum_{i=0}^{k} P(T > t \mid \pi_i)P(\pi_i) = p_{2j}(t). \tag{2.6}$$

Employing (2.2), (2.4), (2.5), (2.6) we obtain

$$\log\big(p_{1j}(t)/p_{10}(t)\big) - \log\big(p_{2j}(t)/p_{20}(t)\big) = \xi_j, \qquad j = 1,\ldots,k, \tag{2.7}$$

independent of $t$.
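The independence of $t$ in (2.7) can be checked numerically. The following is a minimal sketch, not part of the original report: the function names, the baseline hazard, the competing hazards, and the stratum probabilities are all hypothetical illustration values, and the integrals in (2.2) are approximated by a midpoint rule.

```python
import math

def stratum_posteriors(t, xi, prior, base_hazard, other_hazards, steps=2000):
    """Return (p1, p2): stratum probabilities for a case dying at age t
    and for a control alive at age t, following (2.2)-(2.6)."""
    h = t / steps
    mids = [(i + 0.5) * h for i in range(steps)]
    # Midpoint-rule approximation of the integrated baseline hazard on [0, t].
    cum_base = sum(base_hazard(x) for x in mids) * h
    surv = []
    for j, mu in enumerate(other_hazards):
        cum_other = sum(mu(x) for x in mids) * h
        surv.append(math.exp(-(cum_base * math.exp(xi[j]) + cum_other)))  # (2.2)
    nu = [base_hazard(t) * math.exp(xi[j]) * surv[j] for j in range(len(xi))]  # (2.4)
    w1 = [nu[j] * prior[j] for j in range(len(xi))]
    w2 = [surv[j] * prior[j] for j in range(len(xi))]
    p1 = [w / sum(w1) for w in w1]  # (2.5)
    p2 = [w / sum(w2) for w in w2]  # (2.6)
    return p1, p2

def log_odds_gap(t, xi, prior, base_hazard, other_hazards):
    """Left-hand side of (2.7) for j = 1, ..., k."""
    p1, p2 = stratum_posteriors(t, xi, prior, base_hazard, other_hazards)
    return [math.log(p1[j] / p1[0]) - math.log(p2[j] / p2[0])
            for j in range(1, len(xi))]

# Hypothetical illustration values (assumptions, not taken from the report).
xi = [0.0, 0.4, -0.3]          # xi_0 = 0 by convention
prior = [0.5, 0.3, 0.2]        # P(pi_j)
base = lambda x: 0.01 + 0.002 * x
others = [lambda x: 0.02, lambda x: 0.03 + 0.001 * x, lambda x: 0.05]

for t in (10.0, 40.0):
    print(t, log_odds_gap(t, xi, prior, base, others))
```

At both ages the printed gaps should agree with `xi[1:]` up to floating-point error, because the survival factor and the stratum probability cancel in the difference of log odds.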


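Let $(Z_{i0\ell}, \ldots, Z_{ik\ell})$, $Z_{ij\ell} \ge 0$, $\sum_{j=0}^{k} Z_{ij\ell} = 1$, $i = 1,2$; $\ell = 1,\ldots,n$,
be independent multinomial random vectors with $P(Z_{ij\ell} = 1) = p_{ij\ell} \ge 0$ and $\sum_{j=0}^{k} p_{ij\ell} = 1$.
Then

$$P(Z_{ij\ell} = z_{ij\ell};\ i = 1,2;\ j = 0,\ldots,k;\ \ell = 1,\ldots,n) = \prod_{i=1}^{2}\prod_{j=0}^{k}\prod_{\ell=1}^{n} p_{ij\ell}^{\,z_{ij\ell}}. \tag{2.8}$$

Now let

$$Z_{1j\ell} = \begin{cases} 1, & \text{if the } \ell\text{th case is in } \pi_j, \\ 0, & \text{otherwise,} \end{cases}$$

and

$$Z_{2j\ell} = \begin{cases} 1, & \text{if the } \ell\text{th control is in } \pi_j, \\ 0, & \text{otherwise,} \end{cases} \qquad \ell = 1,2,\ldots,n.$$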

Then if the random vectors $(Z_{i0\ell}, \ldots, Z_{ik\ell})$ are conditionally independent given
$T_1^{(1)} = t_1, \ldots, T_1^{(n)} = t_n$, where $T_1^{(\ell)}$ is the age at death of the $\ell$th case, the corresponding
conditional likelihood is given by (2.8) upon setting $p_{ij\ell}$ equal to $p_{ij}(t_\ell)$.
We denote this conditional likelihood by $L(p; Z, t)$, where $p = (p_{ij\ell},\ i = 1,2;\ j = 1,\ldots,k;\ \ell = 1,\ldots,n)$,
$Z = (Z_{ij\ell},\ i = 1,2;\ j = 1,\ldots,k;\ \ell = 1,\ldots,n)$ and $t = (t_1,\ldots,t_n)$.

Then from (2.8)

$$L(p; Z, t) = \Big(\prod_{i=1}^{2}\prod_{\ell=1}^{n} p_{i0\ell}\Big)\prod_{i=1}^{2}\prod_{j=1}^{k}\prod_{\ell=1}^{n} \big(p_{ij\ell}/p_{i0\ell}\big)^{Z_{ij\ell}} = c(p,t)\prod_{i=1}^{2}\prod_{j=1}^{k}\prod_{\ell=1}^{n} \big(p_{ij\ell}/p_{i0\ell}\big)^{Z_{ij\ell}}. \tag{2.9}$$
From (2.7), it follows that

$$L(p; Z, t) = c(p,t)\exp\Big\{\sum_{j=1}^{k} \xi_j \sum_{\ell=1}^{n} Z_{1j\ell} + \sum_{j=1}^{k}\sum_{\ell=1}^{n} (Z_{1j\ell} + Z_{2j\ell})\log\big(p_{2j\ell}/p_{20\ell}\big)\Big\}. \tag{2.10}$$

From well known results on the properties of distributions in an exponential family
(see E. L. Lehmann (1959)), the joint distribution of $\sum_{\ell=1}^{n} Z_{1j\ell}$ given $(Z_{1j\ell} + Z_{2j\ell})$,
$j = 1,\ldots,k$; $\ell = 1,\ldots,n$, is independent of $(p_{2j\ell}/p_{20\ell})$. This conditional distribution is
given in the following theorem.

Theorem 1: Let $\{p_{\alpha\nu}\}$; $\alpha = 1,\ldots,n$; $\nu = 0,\ldots,k$ denote a family of multinomial
distributions (i.e., with sample size unity), that is, $p_{\alpha\nu} \ge 0$, $\sum_{\nu=0}^{k} p_{\alpha\nu} = 1$, for $\alpha = 1,\ldots,n$.
Let $m_j(r)$ be the number of case-control pairs for which $W_{j\ell} = Z_{1j\ell} + Z_{2j\ell} = r$, $r = 0,1,2$,
and let $m_{jj'}$ be the number of pairs for which $W_{j\ell} = 1$ and $W_{j'\ell} = 1$. Then the
distribution of $\big\{\big(\sum_{\ell=1}^{n} Z_{10\ell}, \ldots, \sum_{\ell=1}^{n} Z_{1k\ell}\big) \mid (W_{0\ell},\ldots,W_{k\ell}),\ \ell = 1,\ldots,n\big\}$ is the distribution of the sum
of $n$ independent multinomial random vectors with $m_j(2)$ of them satisfying $p_{\alpha j} = 1$,
$p_{\alpha j'} = 0$, $j' \ne j$, $j = 0,\ldots,k$, and $m_{jj'}$ of them satisfying $p_{\alpha j} = (1+\exp(\xi_{j'} - \xi_j))^{-1}$,
$p_{\alpha j'} = (1+\exp(\xi_j - \xi_{j'}))^{-1}$, $p_{\alpha j''} = 0$, $j'' \ne j, j'$, $0 \le j < j' \le k$. Clearly

$$\sum_{j=0}^{k} m_j(2) + \sum_{0 \le j < j' \le k} m_{jj'} = n.$$

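Proof: For fixed $\ell$,

$$P\{Z_{10\ell} = z_{10\ell}, \ldots, Z_{1k\ell} = z_{1k\ell} \mid W_{0m} = w_{0m}, \ldots, W_{km} = w_{km},\ m = 1,\ldots,n\} = P\{Z_{10\ell} = z_{10\ell}, \ldots, Z_{1k\ell} = z_{1k\ell} \mid W_{0\ell} = w_{0\ell}, \ldots, W_{k\ell} = w_{k\ell}\},$$

since the random vectors $(Z_{i0\ell}, \ldots, Z_{ik\ell})$, $\ell = 1,\ldots,n$; $i = 1,2$, are independent.

All events of positive probability satisfy either

(a) $W_{j\ell} = 2$, $W_{j'\ell} = 0$, $j' \ne j$, or

(b) $W_{j\ell} = 1$, $W_{j'\ell} = 1$, $W_{j''\ell} = 0$, $j'' \ne j, j'$,

since by definition $\sum_{j=0}^{k} W_{j\ell} = 2$.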
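We denote the events indicated by (a) by $E_{j\ell}(2)$ and the events indicated by (b) by
$E_{jj'\ell}(1,1)$. Then

$$P\{Z_{10\ell} = z_{10\ell}, \ldots, Z_{1k\ell} = z_{1k\ell} \mid E_{j\ell}(2)\} = \begin{cases} 1, & \text{if } z_{1j\ell} = 1,\ z_{1j'\ell} = 0,\ j' \ne j, \\ 0, & \text{otherwise,} \end{cases} \tag{2.11}$$

and

$$P\{Z_{10\ell} = z_{10\ell}, \ldots, Z_{1k\ell} = z_{1k\ell} \mid E_{jj'\ell}(1,1)\} = \begin{cases} \dfrac{p_{1j\ell}\,p_{2j'\ell}}{p_{1j\ell}\,p_{2j'\ell} + p_{1j'\ell}\,p_{2j\ell}} = \big(1+e^{\xi_{j'}-\xi_j}\big)^{-1}, & \text{if } z_{1j\ell} = 1,\ z_{1h\ell} = 0,\ h \ne j, \\[2ex] \dfrac{p_{1j'\ell}\,p_{2j\ell}}{p_{1j\ell}\,p_{2j'\ell} + p_{1j'\ell}\,p_{2j\ell}} = \big(1+e^{\xi_j-\xi_{j'}}\big)^{-1}, & \text{if } z_{1j'\ell} = 1,\ z_{1h\ell} = 0,\ h \ne j', \\[2ex] 0, & \text{otherwise.} \end{cases} \tag{2.12}$$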


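Corollary 1. Letting $T_j = \sum_{\ell=1}^{n} Z_{1j\ell}$, the conditional means, variances and covariances of
$T_j$ are given by

$$\begin{aligned}
\mu_j^{(n)} &= E(T_j \mid W_{0\ell},\ldots,W_{k\ell};\ \ell = 1,\ldots,n) = m_j^{(n)}(2) + \sum_{j' \ne j} m_{jj'}^{(n)}\big(1+e^{\xi_{j'}^{(n)}-\xi_j^{(n)}}\big)^{-1}, && j = 0,\ldots,k, \\
\sigma_{jj}^{(n)} &= \mathrm{Var}(T_j \mid W_{0\ell},\ldots,W_{k\ell};\ \ell = 1,\ldots,n) = \sum_{j' \ne j} m_{jj'}^{(n)}\big(1+e^{\xi_{j'}^{(n)}-\xi_j^{(n)}}\big)^{-1}\big(1+e^{\xi_j^{(n)}-\xi_{j'}^{(n)}}\big)^{-1}, && j = 0,\ldots,k, \\
\sigma_{jj'}^{(n)} &= \mathrm{Cov}(T_j, T_{j'} \mid W_{0\ell},\ldots,W_{k\ell};\ \ell = 1,\ldots,n) = -m_{jj'}^{(n)}\big(1+e^{\xi_{j'}^{(n)}-\xi_j^{(n)}}\big)^{-1}\big(1+e^{\xi_j^{(n)}-\xi_{j'}^{(n)}}\big)^{-1}, && 0 \le j < j' \le k.
\end{aligned} \tag{2.13}$$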

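Theorem 2: If

$$m_{0j}^{(n)}\big(1+e^{\xi_j^{(n)}}\big)^{-1}\big(1+e^{-\xi_j^{(n)}}\big)^{-1} \to \infty \quad \text{as } n \to \infty \text{ for all } j = 1,\ldots,k,$$

then the distribution of

$$\sum_{j=1}^{k} a_j\big(T_j - \mu_j^{(n)}\big)\Big/\Big(\sum_{j=1}^{k}\sum_{j'=1}^{k} a_j a_{j'}\sigma_{jj'}^{(n)}\Big)^{1/2}, \quad \text{given } (W_{0\ell},\ldots,W_{k\ell};\ \ell = 1,\ldots,n),$$

is asymptotically normal with mean zero and unit variance, whenever $a_1,\ldots,a_k$ are not all
zero.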


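Proof: Defining $a_0 = 0$, we can write

$$\sum_{j=1}^{k} a_j\big(T_j - \mu_j^{(n)}\big) = \sum_{\ell=1}^{n}\sum_{j=0}^{k} a_j\big(Z_{1j\ell} - p_{j\ell}^{(n)}\big),$$

where

$$p_{j\ell}^{(n)} = \begin{cases} \big(1+e^{\xi_{j'}^{(n)}-\xi_j^{(n)}}\big)^{-1}, & \text{if } W_{j\ell} = 1,\ W_{j'\ell} = 1 \text{ for some } j' \ne j, \\ 1, & \text{if } W_{j\ell} = 2, \\ 0, & \text{otherwise.} \end{cases}$$

Let $X_\ell = \sum_{j=0}^{k} a_j\big(Z_{1j\ell} - p_{j\ell}^{(n)}\big)$, $\ell = 1,2,\ldots,n$. Then the conclusion will follow from
Liapunov's theorem, upon establishing that for some $\delta > 0$,

$$\frac{\sum_{\ell=1}^{n} E\big(|X_\ell|^{2+\delta} \mid W_{0\ell},\ldots,W_{k\ell};\ \ell = 1,\ldots,n\big)}{\Big[\sum_{j=1}^{k}\sum_{j'=1}^{k} a_j a_{j'}\sigma_{jj'}^{(n)}\Big]^{1+\delta/2}} \to 0. \tag{2.14}$$

Since $Z_{1j\ell} - p_{j\ell}^{(n)} = 0$ with probability one given $W_{j\ell} = 2$ for some $j$ and $W_{j'\ell} = 0$,
$j' \ne j$, $0 \le j, j' \le k$, $X_\ell$ is non-zero if and only if $W_{j\ell} = 1$ and $W_{j'\ell} = 1$, $j < j'$, and
$W_{j''\ell} = 0$, $j'' \ne j, j'$. In this case, for $\delta \ge 0$,

$$E\big\{|X_\ell|^{2+\delta} \mid W_{j\ell} = 1,\ W_{j'\ell} = 1,\ W_{j''\ell} = 0,\ j'' \ne j, j'\big\} = |a_j - a_{j'}|^{2+\delta}\,\omega_{jj'}^{(n)}\big(1-\omega_{jj'}^{(n)}\big)\Big(\big(\omega_{jj'}^{(n)}\big)^{1+\delta} + \big(1-\omega_{jj'}^{(n)}\big)^{1+\delta}\Big),$$

where $\omega_{jj'}^{(n)} = \big(1+e^{\xi_{j'}^{(n)}-\xi_j^{(n)}}\big)^{-1}$. Consequently, we can write (2.14) as

$$\frac{\sum_{0 \le j < j' \le k} m_{jj'}^{(n)}\,|a_j - a_{j'}|^{2+\delta}\,\omega_{jj'}^{(n)}\big(1-\omega_{jj'}^{(n)}\big)\Big(\big(\omega_{jj'}^{(n)}\big)^{1+\delta} + \big(1-\omega_{jj'}^{(n)}\big)^{1+\delta}\Big)}{\Big[\sum_{0 \le j < j' \le k} m_{jj'}^{(n)}\,|a_j - a_{j'}|^{2}\,\omega_{jj'}^{(n)}\big(1-\omega_{jj'}^{(n)}\big)\Big]^{1+\delta/2}},$$

which tends to zero as $n \to \infty$.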

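Corollary 2: If $m_{jj'}^{(n)}/n \xrightarrow{\text{a.s.}} c_{jj'}$, $0 \le j < j' \le k$, $c_{0j} > 0$ for all $j = 1,\ldots,k$,
and $\xi_j^{(n)} \to \xi_j$ as $n \to \infty$, $j = 1,\ldots,k$, then the distribution of the random vector

$$n^{-1/2}\big\{\big(T_1 - \mu_1^{(n)}\big), \ldots, \big(T_k - \mu_k^{(n)}\big)\big\}, \quad \text{given } (W_{0\ell},\ldots,W_{k\ell};\ \ell = 1,\ldots,n),$$

is almost surely asymptotically normal with mean zero and covariance matrix
$\Sigma = (\sigma_{jj'},\ j, j' = 1,\ldots,k)$, where

$$\sigma_{jj'} = \begin{cases} \sum_{j'' \ne j} c_{jj''}\,\omega_{jj''}\big(1-\omega_{jj''}\big), & \text{if } j = j', \\ -c_{jj'}\,\omega_{jj'}\big(1-\omega_{jj'}\big), & \text{if } j \ne j', \end{cases}$$

and $\omega_{jj'} = \big(1+e^{\xi_{j'}-\xi_j}\big)^{-1}$.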


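Proof: As a consequence of the multivariate central limit theorem (see Rao (1973)), it
suffices to show that for all $a = (a_1,\ldots,a_k) \ne 0$, the distribution of

$$\sum_{j=1}^{k} a_j\big(T_j - \mu_j^{(n)}\big)\Big/\Big(n\sum_{j=1}^{k}\sum_{j'=1}^{k} a_j a_{j'}\sigma_{jj'}\Big)^{1/2}, \quad \text{given } (W_{0\ell},\ldots,W_{k\ell};\ \ell = 1,\ldots,n),$$

is asymptotically $N(0,1)$.

We can write

$$\frac{\sum_{j=1}^{k} a_j\big(T_j - \mu_j^{(n)}\big)}{\Big(n\sum_{j}\sum_{j'} a_j a_{j'}\sigma_{jj'}\Big)^{1/2}} = \frac{\sum_{j=1}^{k} a_j\big(T_j - \mu_j^{(n)}\big)}{\Big(\sum_{j}\sum_{j'} a_j a_{j'}\sigma_{jj'}^{(n)}\Big)^{1/2}} \cdot \frac{\Big(\sum_{j}\sum_{j'} a_j a_{j'}\sigma_{jj'}^{(n)}/n\Big)^{1/2}}{\Big(\sum_{j}\sum_{j'} a_j a_{j'}\sigma_{jj'}\Big)^{1/2}}.$$

By assumption $\sigma_{jj'}^{(n)}/n \xrightarrow{\text{a.s.}} \sigma_{jj'}$; therefore

$$\frac{\sum_{j}\sum_{j'} a_j a_{j'}\sigma_{jj'}^{(n)}/n}{\sum_{j}\sum_{j'} a_j a_{j'}\sigma_{jj'}} \xrightarrow{\text{a.s.}} 1,$$

and the proof follows from Theorem 2.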

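Corollary 3: Let $\mu_{j0}^{(n)} = E(T_j \mid W_{0\ell},\ldots,W_{k\ell};\ \ell = 1,\ldots,n;\ \xi_j = 0,\ j = 1,\ldots,k)$. Then

$$\mu_{j0}^{(n)} = m_j^{(n)}(2) + \sum_{j' \ne j} m_{jj'}^{(n)}/2.$$

If $m_{jj'}^{(n)}/n \xrightarrow{\text{a.s.}} c_{jj'}$, $0 \le j < j' \le k$, $c_{0j} > 0$, $j = 1,\ldots,k$, and $\xi_j^{(n)}\sqrt{n} \to \tau_j$
as $n \to \infty$, then the distribution of the random vector

$$n^{-1/2}\big\{\big(T_1 - \mu_{10}^{(n)}\big), \ldots, \big(T_k - \mu_{k0}^{(n)}\big)\big\}, \quad \text{given } (W_{0\ell},\ldots,W_{k\ell};\ \ell = 1,\ldots,n),$$

is almost surely asymptotically normal with mean $X\tau$, where $\tau = (\tau_1,\ldots,\tau_k)'$, and covariance matrix
$X = (x_{jj'})$, where

$$x_{jj'} = \begin{cases} \sum_{j'' \ne j} c_{jj''}/4, & j = j', \\ -c_{jj'}/4, & j \ne j'. \end{cases}$$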
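Proof: We can write $n^{-1/2}\big(T_j - \mu_{j0}^{(n)}\big) = Q_j^{(n)} + R_j^{(n)}$, $j = 1,\ldots,k$, where $Q_j^{(n)} = n^{-1/2}\big(T_j - \mu_j^{(n)}\big)$
and $R_j^{(n)} = n^{-1/2}\big(\mu_j^{(n)} - \mu_{j0}^{(n)}\big)$. By Corollary 2, the random vector $(Q_j^{(n)},\ j = 1,\ldots,k)$ given
$(W_{0\ell},\ldots,W_{k\ell};\ \ell = 1,\ldots,n)$ is asymptotically $N(0,X)$. Since

$$R_j^{(n)} = n^{-1/2}\sum_{j' \ne j} m_{jj'}^{(n)}\Big[\big(1+e^{\xi_{j'}^{(n)}-\xi_j^{(n)}}\big)^{-1} - \tfrac{1}{2}\Big]$$

and

$$\sqrt{n}\,\Big[\big(1+e^{\xi_{j'}^{(n)}-\xi_j^{(n)}}\big)^{-1} - \tfrac{1}{2}\Big] \to (\tau_j - \tau_{j'})/4 \quad \text{as } n \to \infty,$$

then, with $\tau_0 = 0$,

$$R_j^{(n)} \xrightarrow{\text{a.s.}} \sum_{j' \ne j} (\tau_j - \tau_{j'})\,c_{jj'}/4 = \tau_j\sum_{j' \ne j} c_{jj'}/4 - \sum_{j' \ne j} \tau_{j'}\,c_{jj'}/4.$$

Hence, the vector $(R_j^{(n)},\ j = 1,\ldots,k) \xrightarrow{\text{a.s.}} X\tau$,
and the conclusion follows.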
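The limit used for $R_j^{(n)}$ rests on a first-order expansion of the logistic function about zero. As a brief worked step (the shorthand $x_n$ is introduced here only for illustration), write $x_n = \xi_j^{(n)} - \xi_{j'}^{(n)}$, so that $\sqrt{n}\,x_n \to \tau_j - \tau_{j'}$ under the local alternatives of Corollary 3:

```latex
\[
\bigl(1 + e^{-x}\bigr)^{-1} = \tfrac{1}{2} + \tfrac{x}{4} + O(x^{3})
\quad (x \to 0),
\]
\[
\sqrt{n}\,\Bigl[\bigl(1 + e^{-x_n}\bigr)^{-1} - \tfrac{1}{2}\Bigr]
= \tfrac{1}{4}\sqrt{n}\,x_n + O\bigl(\sqrt{n}\,x_n^{3}\bigr)
\;\longrightarrow\; \tfrac{1}{4}\,(\tau_j - \tau_{j'}),
\]
```

since $\sqrt{n}\,x_n^{3} = (\sqrt{n}\,x_n)^{3}/n \to 0$. The same expansion shows that $\omega_{jj'}^{(n)}\bigl(1 - \omega_{jj'}^{(n)}\bigr) \to 1/4$, which is why the covariance matrix of Corollary 2 reduces to $X$ here.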

Remark 1: The proposed design matches an individual who died of the disease
under investigation at age $t$ with a random individual alive at age $t$. Let the death
times $t_\ell$, $\ell = 1,\ldots,n$, be independently distributed with density function $f(t)$ and let $c_{jj'}$
be the unconditional probability that the case is in stratum $\pi_j$ and the control is in
stratum $\pi_{j'}$, or vice versa. Then from (2.5) and (2.6) we get

$$c_{jj'} = \int_0^{\infty} \big(p_{1j}(t)\,p_{2j'}(t) + p_{1j'}(t)\,p_{2j}(t)\big) f(t)\,dt.$$

The marginal distribution of the $m_{jj'}^{(n)}$ is then the multinomial distribution with sample
size $n$ and cell probabilities $c_{jj'}$, $0 \le j < j' \le k$, which is the motivation for the
hypotheses of Corollaries 2 and 3.

By Corollary 3, $Y = n^{-1/2}\big\{\big(T_1 - \mu_{10}^{(n)}\big), \ldots, \big(T_k - \mu_{k0}^{(n)}\big)\big\}'$ given $W_{0\ell},\ldots,W_{k\ell}$; $\ell = 1,\ldots,n$ is
asymptotically $N(X\tau, \Sigma) = N(X\tau, X)$. From the theory of general linear models an efficient
test statistic for testing $\tau = 0$ is given by

$$\hat\tau'\,\big[\Sigma(\hat\tau)\big]^{-1}\hat\tau, \tag{2.15}$$

where $\hat\tau$, the weighted least squares estimate, is

$$\hat\tau = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}Y = X^{-1}Y$$

and

$$\Sigma(\hat\tau) = (X'\Sigma^{-1}X)^{-1} = X^{-1}.$$

Consequently, the statistic (2.15) reduces to $Y'X^{-1}Y$. Under local alternatives this is
distributed as the non-central $\chi^2$ distribution with $k$ degrees of freedom and
non-centrality parameter $\tau'X\tau$. In practice, $X$ has to be replaced by its consistent
estimate $\hat X = (\hat x_{jj'})$, where

$$\hat x_{jj'} = \begin{cases} n^{-1}\sum_{j'' \ne j} m_{jj''}^{(n)}/4, & \text{if } j = j', \\ -n^{-1}\,m_{jj'}^{(n)}/4, & \text{if } j \ne j'. \end{cases}$$

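Therefore, the test of size $\alpha$ rejects $H_0\colon \xi_j = 0$, $j = 1,\ldots,k$, whenever

$$T_0'\,\hat\Sigma_0^{-1}\,T_0 > \chi^2_{\alpha;k},$$

where

$$T_0 = (T_j^0,\ j = 1,\ldots,k)', \qquad T_j^0 = T_j - m_j^{(n)}(2) - \sum_{j' \ne j} m_{jj'}^{(n)}/2,$$

$$\hat\Sigma_0 = (\hat\sigma_{jj'};\ j, j' = 1,\ldots,k), \qquad \hat\sigma_{jj'} = \begin{cases} \sum_{j'' \ne j} m_{jj''}^{(n)}/4, & \text{if } j = j', \\ -m_{jj'}^{(n)}/4, & \text{if } j \ne j', \end{cases}$$

and $\chi^2_{\alpha;k}$ is the $(1-\alpha)$th percentile of the chi-square distribution with $k$ degrees
of freedom.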
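The quadratic-form statistic of Section 2 can be computed directly from a table of matched pairs. The following is a minimal sketch under assumptions not found in the original: the input layout `pairs[j][jp]` (case in stratum j, control in stratum jp), the function names, and the small pure-Python linear solver are all hypothetical choices made for illustration. The returned value is to be compared with the upper-$\alpha$ chi-square critical value with $k$ degrees of freedom taken from tables.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[piv][col]) < 1e-12:
            raise ValueError("covariance estimate is singular")
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (aug[i][n] - sum(aug[i][c] * x[c] for c in range(i + 1, n))) / aug[i][i]
    return x

def matched_pair_chi2(pairs):
    """Return the quadratic form T0' V^{-1} T0 over strata 1..k.

    pairs[j][jp]: number of matched pairs with the case in stratum j and the
    control in stratum jp, for j, jp = 0..k. Concordant pairs (j == jp) cancel.
    """
    k1 = len(pairs)
    # m[j][jp]: pairs split between strata j and jp, in either order.
    m = [[pairs[j][jp] + pairs[jp][j] if j != jp else 0 for jp in range(k1)]
         for j in range(k1)]
    # T_j - m_j(2) is the number of discordant pairs whose case lies in stratum j;
    # subtracting half the discordant total for stratum j centres it under H0.
    t0 = [sum(pairs[j][jp] for jp in range(k1) if jp != j) - sum(m[j]) / 2.0
          for j in range(1, k1)]
    v = [[(sum(m[j]) if j == jp else -m[j][jp]) / 4.0 for jp in range(1, k1)]
         for j in range(1, k1)]
    y = solve(v, t0)
    return sum(a * b for a, b in zip(t0, y))

# Hypothetical 3 x 3 table of case (row) by control (column) strata.
example_pairs = [[10, 4, 6], [8, 12, 3], [2, 5, 9]]
statistic = matched_pair_chi2(example_pairs)
```

Stratum 0 is dropped from the quadratic form because the counts are linearly dependent; keeping all $k+1$ strata would make the covariance estimate singular.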

Remark 2: For the case of two risk categories the problem has been studied by Miettinen
(1968), who obtained a test previously given by McNemar (1947).

We also note that the test derived above is identical to the test for homogeneity of
marginal distributions in a two way classification given by Stuart (1955).
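To make the two-category reduction explicit, take $k = 1$ and let $b$ denote the number of pairs with the case in stratum $\pi_1$ and the control in $\pi_0$, and $c$ the number with the roles reversed (the symbols $b$ and $c$ are introduced here only for illustration). Then $T_1 - m_1^{(n)}(2) = b$ and $m_{01}^{(n)} = b + c$, so the quantities entering the Section 2 statistic become

```latex
\[
T_1^{0} = b - \frac{b + c}{2} = \frac{b - c}{2},
\qquad
\hat\sigma_{11} = \frac{b + c}{4},
\]
\[
\frac{\bigl(T_1^{0}\bigr)^{2}}{\hat\sigma_{11}}
= \frac{(b - c)^{2}/4}{(b + c)/4}
= \frac{(b - c)^{2}}{b + c},
\]
```

which is the familiar form of McNemar's statistic without continuity correction, referred to the chi-square distribution with one degree of freedom.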


3.  QUANTITATIVELY  ORDERED  CATEGORIES 

In some applications it may be possible to associate a quantitative measure to each
stratum. For example, these measures may be the amounts of exposure to an environmental
agent under investigation. Let $v_0, v_1, \ldots, v_k$ be the values assigned to each of the strata.
Assume that the hazard functions for the disease of interest and for the other causes of
death for individuals in stratum $j$ are $\lambda(t)\exp(\beta v_j)$ and $\mu_j(t)$ respectively. This
model has been proposed by D. R. Cox (1972).

The null hypothesis is $\beta = 0$ and suggests no association between the strata and death
due to the disease of interest. With these assumptions, analogously to (2.5) and (2.6),
we get

$$\log\big(p_{1j}(t)/p_{10}(t)\big) - \log\big(p_{2j}(t)/p_{20}(t)\big) = \beta(v_j - v_0),$$

independent of $t$. Hence, analogous to (2.10), we have

$$L(p; Z, t) = c(p,t)\exp\Big\{\beta\sum_{j=1}^{k} (v_j - v_0)\sum_{\ell=1}^{n} Z_{1j\ell} + \sum_{j=1}^{k}\sum_{\ell=1}^{n} (Z_{1j\ell} + Z_{2j\ell})\log\big(p_{2j\ell}/p_{20\ell}\big)\Big\}.$$


Using well known results on distributions in an exponential family, a UMP unbiased
test for $H\colon \beta = 0$ vs $K\colon \beta > 0$ rejects for large values of $\sum_{j=1}^{k} (v_j - v_0)\sum_{\ell=1}^{n} Z_{1j\ell}$,
conditional on $\big((Z_{1j\ell} + Z_{2j\ell}),\ j = 1,\ldots,k;\ \ell = 1,\ldots,n\big)$. To obtain a large sample approximation
to the distribution of $\sum_{j=1}^{k} (v_j - v_0)T_j$ given $(W_{j\ell},\ j = 0,\ldots,k;\ \ell = 1,\ldots,n)$, we proceed as in
Section 2, noting that under $H_0$

$$n^{-1/2}\big\{\big(T_1 - \mu_{10}^{(n)}\big), \ldots, \big(T_k - \mu_{k0}^{(n)}\big)\big\}, \quad \text{given } (W_{j\ell},\ j = 0,\ldots,k;\ \ell = 1,\ldots,n),$$

is asymptotically normally distributed with mean 0 and covariance matrix $X$. Consequently,

$$n^{-1/2}\Big(\sum_{j=1}^{k} (v_j - v_0)\,T_j - \sum_{j=1}^{k} (v_j - v_0)\,\mu_{j0}^{(n)}\Big), \quad \text{given } (W_{j\ell},\ j = 0,\ldots,k;\ \ell = 1,\ldots,n),$$

is asymptotically normal with mean 0 and variance $v'Xv$, where $v' = (v_1 - v_0, \ldots, v_k - v_0)$.

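We can estimate the variance by $v'\hat X v$, or

$$\begin{aligned}
n^{-1}\sum_{j=1}^{k}\sum_{j'=1}^{k} (v_j - v_0)(v_{j'} - v_0)\,n\hat x_{jj'}
&= n^{-1}\Big[\sum_{j=0}^{k-1}\sum_{j'=j+1}^{k} \big((v_j - v_0)^2 + (v_{j'} - v_0)^2\big)\,m_{jj'}^{(n)}/4 - \sum_{j=0}^{k-1}\sum_{j'=j+1}^{k} 2(v_j - v_0)(v_{j'} - v_0)\,m_{jj'}^{(n)}/4\Big] \\
&= n^{-1}\sum_{j=0}^{k-1}\sum_{j'=j+1}^{k} m_{jj'}^{(n)}\,(v_j - v_{j'})^2/4.
\end{aligned}$$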


Under $H_0$,

$$\frac{2\sum_{j=1}^{k} (v_j - v_0)\big(T_j - \mu_{j0}^{(n)}\big)}{\Big(\sum_{j=0}^{k-1}\sum_{j'=j+1}^{k} (v_j - v_{j'})^2\,m_{jj'}^{(n)}\Big)^{1/2}}, \tag{3.1}$$

given $(W_{j\ell};\ j = 0,\ldots,k;\ \ell = 1,\ldots,n)$, is asymptotically distributed as a standard normal.
Therefore the UMP unbiased level $\alpha$ test for $H_0\colon \beta = 0$ vs $K\colon \beta > 0$ consists of rejecting $H_0$
when the test statistic (3.1) is greater than $z_\alpha$, where $z_\alpha$ is the $(1-\alpha)$th percentile
of the standard normal.
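The one-sided statistic (3.1) of Section 3 can be sketched in a few lines. The input layout `pairs[j][jp]` (case in stratum j, control in stratum jp), the function name `trend_statistic`, and the example scores are assumptions made for illustration; they do not appear in the original report.

```python
import math

def trend_statistic(pairs, v):
    """Standardized statistic (3.1) for H0: beta = 0 against beta > 0.

    pairs[j][jp]: number of matched pairs with the case in stratum j and the
    control in stratum jp, j, jp = 0..k.
    v[j]: quantitative exposure value assigned to stratum j.
    """
    k1 = len(pairs)
    num = 0.0
    for j in range(1, k1):
        # Discordant pairs whose case is in stratum j, i.e. T_j - m_j(2).
        cases_j = sum(pairs[j][jp] for jp in range(k1) if jp != j)
        # All discordant pairs involving stratum j, i.e. the sum of m_{jj'} over j'.
        m_j = sum(pairs[j][jp] + pairs[jp][j] for jp in range(k1) if jp != j)
        num += (v[j] - v[0]) * (cases_j - m_j / 2.0)
    den = sum((pairs[j][jp] + pairs[jp][j]) * (v[j] - v[jp]) ** 2
              for j in range(k1) for jp in range(j + 1, k1))
    if den == 0:
        raise ValueError("no discordant pairs with distinct exposure values")
    return 2.0 * num / math.sqrt(den)

# Hypothetical table and exposure scores for three strata.
trend_pairs = [[10, 4, 6], [8, 12, 3], [2, 5, 9]]
trend_scores = [0.0, 1.0, 2.0]
z = trend_statistic(trend_pairs, trend_scores)
```

Under these conventions the numerator equals the sum, over discordant pairs, of the case's exposure value minus the control's, and the denominator is the square root of the corresponding sum of squared differences, so only discordant pairs contribute. The null hypothesis would be rejected when the returned value exceeds the upper-$\alpha$ standard normal critical value.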